Pairwise comparisons across species are problematic when analyzing functional genomic data
Abstract
There is considerable interest in comparing functional genomic data across species. One goal of such work is to provide an integrated understanding of genome and phenotype evolution. Most comparative functional genomic studies have relied on multiple pairwise comparisons between species, an approach that does not incorporate information about the evolutionary relationships among species. The statistical problems that arise from not considering these relationships can lead pairwise approaches to the wrong conclusions and are a missed opportunity to learn about biology that can only be understood in an explicit phylogenetic context. Here, we examine two recently published studies that compare gene expression across species with pairwise methods, and find reason to question the original conclusions of both. One study interpreted pairwise comparisons of gene expression as support for the ortholog conjecture, the hypothesis that orthologs tend to have more similar attributes (expression in this case) than paralogs. The other study interpreted pairwise comparisons of embryonic gene expression across distantly related animals as evidence for a distinct evolutionary process that gave rise to phyla. In each study, distinct patterns of pairwise similarity among species were originally interpreted as evidence of particular evolutionary processes, but instead, we find that they reflect species relationships. These reanalyses concretely show the inadequacy of pairwise comparisons for analyzing functional genomic data across species. It will be critical to adopt phylogenetic comparative methods in future functional genomic work. Fortunately, phylogenetic comparative biology is also a rapidly advancing field with many methods that can be directly applied to functional genomic data.
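The statistical point about pairwise comparisons can be made concrete with a small sketch. The example below assumes a hypothetical four-species tree, ((A,B),(C,D)), with unit branch lengths and made-up expression values; none of these come from the studies discussed. It uses Felsenstein's independent contrasts as one illustrative phylogenetic comparative method, which is not necessarily the method applied in the reanalyses.

```python
from itertools import combinations

# Hypothetical data: tree ((A:1,B:1):1,(C:1,D:1):1) with made-up expression values.
# A leaf is a float (trait value); an internal node is a pair of
# (child, branch_length) tuples.
tree = (
    (((1.0, 1.0), (2.0, 1.0)), 1.0),  # clade (A, B), stem length 1.0
    (((5.0, 1.0), (6.0, 1.0)), 1.0),  # clade (C, D), stem length 1.0
)
tip_values = {"A": 1.0, "B": 2.0, "C": 5.0, "D": 6.0}


def independent_contrasts(node):
    """Return (node_value, extra_branch_length, standardized_contrasts)."""
    if isinstance(node, float):
        # A tip contributes its observed value and no contrasts.
        return node, 0.0, []
    (left, bl_left), (right, bl_right) = node
    x1, e1, c1 = independent_contrasts(left)
    x2, e2, c2 = independent_contrasts(right)
    # Branch lengths are extended to account for uncertainty in the
    # estimated values of internal nodes.
    v1, v2 = bl_left + e1, bl_right + e2
    contrast = (x1 - x2) / (v1 + v2) ** 0.5
    # Node value is the inverse-variance weighted mean of its two children.
    value = (x1 / v1 + x2 / v2) / (1 / v1 + 1 / v2)
    extra = v1 * v2 / (v1 + v2)
    return value, extra, c1 + c2 + [contrast]


# Every pair of tips is compared, so shared branches are counted repeatedly.
pairwise = [
    (a, b, tip_values[a] - tip_values[b])
    for a, b in combinations(sorted(tip_values), 2)
]
_, _, pic = independent_contrasts(tree)

print(len(pairwise), "pairwise differences")
print(len(pic), "independent contrasts")
```

With four tips there are six pairwise differences but only three independent contrasts. Four of the six pairs (A-C, A-D, B-C, B-D) all traverse the same two internal branches, so treating them as separate observations overstates the evidence, whereas the contrasts summarize that shared history in a single value at the root.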